GIS Overview

GIS Data

There are two main types of spatial data: vector and raster data.

  • Vector: Vectors (also called shapefiles) consist of points, lines and polygons. These shapes are attached to a dataframe, where each row corresponds to a different spatial element.

  • Raster: Rasters are spatially-referenced grids where each cell has one value.


Shapefile

Raster

Coordinate Reference Systems

  • Coordinate reference systems map pairs of numbers to a location on the earth.
  • Geographic Coordinate Systems live on a sphere; here, the units are in decimal degrees
  • Using the WGS84 coordinate system the World Bank MC building is located at 38.89 degrees latitude and -77.04 degrees longitude.
  • Projected Coordinate Systems project the earth onto a flat surface. Doing this distorts the earth in some way (shape, area, distance or direction)
  • Using to the World Mercator projection, the World Bank is located 4680364.64 north and -8576320.73 east.

GIS in R System

Here, we'll use the following packages

  • sp: Defines classes and methods for spatial objects.
  • rgdal: For processing spatial data, particularly for reading and writing spatial data.
  • rgeos: A vector processing library.
  • raster: Reading, writing, manipulating, analyzing and modeling of gridded spatial data.

Load and Map HH-Level Data

# Packages and Filepaths -------------------------------------------------------
library(sp)
library(rgdal)
library(rgeos)
library(raster)

hh_data <- read.csv(file.path(finalData,"HH_data.csv"))

str(hh_data)
## 'data.frame':    422 obs. of  5 variables:
##  $ X                 : int  1 3 4 5 6 7 8 10 11 12 ...
##  $ expend_food_yearly: num  250452 420029 484468 326109 131487 ...
##  $ food_security     : Factor w/ 3 levels "Little Hunger",..: 1 2 1 1 3 3 1 2 3 1 ...
##  $ longitude_scramble: num  30.4 30.4 30.4 30.4 30.4 ...
##  $ latitude_scramble : num  -1.97 -2 -1.99 -1.98 -1.95 ...

Make a Map

library(ggplot2)
ggplot() +
  geom_point(data=hh_data, 
             aes(x=longitude_scramble, y=latitude_scramble, color=food_security),
             size=.7)

Make a Better Map

hh_map1 <- ggplot() +

  # Points
  geom_point(data=hh_data, 
             aes(x=longitude_scramble, y=latitude_scramble, color=food_security),
             size=.7) +
  
  # Other elements to improve map
  coord_quickmap() + # make sure the map doesn't look distorted. Can also use
                     # coord_map(), but sometimes makes the process slow
  theme_void() +
  scale_color_manual(values=c("green", "orange", "red")) +
  labs(title="Household Food Security", color="Food Security") +
  theme(plot.title = element_text(hjust = 0.5, face="bold")) # center and bold title

hh_map1

Add a Basemap

library(ggmap)
basemap <- get_map(location = c(lon=mean(hh_data$longitude_scramble), 
                                lat=mean(hh_data$latitude_scramble)), 
                   zoom=10,
                   maptype="roadmap") # roadmap, satellite, etc. See help(get_map)

hh_map2 <- ggmap(basemap) +
  geom_point(data=hh_data, 
             aes(x=longitude_scramble, 
                 y=latitude_scramble, 
                 color=food_security),
             size=.7) +
  coord_quickmap() + # make sure the map doesn't look distorted. Can also use
                     # coord_map(), but sometimes makes the process slow
  theme_void() +
  labs(title="Household Food Security", color="Food Security") +
  theme(plot.title = element_text(hjust = 0.5, face="bold")) +
  scale_color_manual(values=c("green", "orange", "red"))

hh_map2

hh_map2 + 
  scale_x_continuous(limits = c(min(hh_data$longitude_scramble), 
                                max(hh_data$longitude_scramble)), 
                     expand = c(.03, .03)) +
  scale_y_continuous(limits = c(min(hh_data$latitude_scramble), 
                                max(hh_data$latitude_scramble)), 
                     expand = c(.03, .03))

Spatial Dataframe

So far, we've just been working with a dataframe and using the longitude and latitude variables. However, for many other uses of spatial data we need to convert the dataframe into a spatial dataframe.

  • coordinates transforms a dataframe into a spatial dataframe. The syntax is: coordinates(DATA FRAME) <- ~LONGITUDE+LATITUDE, where LONGITUDE and LATITUDE are the names of variables in DATA FRAME.
  • After creating a spatial dataframe we need to tell it what coordinate reference system (CRS) the coordinates are in. We define the crs using crs.
hh_data_sdf <- hh_data
coordinates(hh_data_sdf) <- ~longitude_scramble+latitude_scramble
crs(hh_data_sdf) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")

Spatial Dataframe Structure

Here, the variables are separated from the coordinates. Specifically, a spatial dataframe is a list, where one element of the list is a dataframe of variables and other is a dataframe of coordinates.

hh_data_sdf
## class       : SpatialPointsDataFrame 
## features    : 422 
## extent      : 30.26138, 30.67035, -2.094064, -1.934281  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 3
## names       :   X, expend_food_yearly, food_security 
## min values  :   1,              0.000, Little Hunger 
## max values  : 553,        3034121.500, Severe Hunger

Access a dataframe containing the variables (except the coordinates)

hh_data_sdf@data


Access the coordinates of the points

hh_data_sdf@coords


We can still access variables like a normal dataframe

hh_data_sdf@data$food_security
hh_data_sdf$food_security # short-cut

Quickly Plot Spatial Data

We can quickly plot spatial dataframes.

plot(hh_data_sdf)

Perform Spatial Operations

See rgeos for a list of common functions to apply on vector data.

Buffering

library(rgeos)
hh_data_sdf_buff <- gBuffer(hh_data_sdf, width= 0.5/111, byid=T) #buffer by about 10km
plot(hh_data_sdf_buff)

Reproject

head(hh_data_sdf@coords)
##   longitude_scramble latitude_scramble
## 1           30.41310         -1.965660
## 2           30.42626         -2.003319
## 3           30.42944         -1.986508
## 4           30.42331         -1.976507
## 5           30.42285         -1.952778
## 6           30.41069         -2.006962
hh_data_sdf_newproj <- spTransform(hh_data_sdf, CRS("+init=epsg:3857"))
head(hh_data_sdf_newproj@coords)
##   longitude_scramble latitude_scramble
## 1            3385571         -218859.2
## 2            3387035         -223053.9
## 3            3387390         -221181.4
## 4            3386707         -220067.4
## 5            3386657         -217424.3
## 6            3385302         -223459.7

Calculating Distances

Note: See appendix for ways to calculate distances in a more accurate way (e.g., by using an equal area projection or by taking into account the curvator of the earth)

dist_matrix <- gDistance(hh_data_sdf_newproj, byid=T)
dist_matrix[1:7,1:7]
##          1         2        3        4        5        6         7
## 1    0.000 4442.9586 2950.070 1658.693 1799.332 4608.262 4019.3068
## 2 4442.959    0.0000 1905.795 3004.456 5642.253 1779.875  850.3531
## 3 2950.070 1905.7947    0.000 1306.708 3828.044 3090.272 1193.2667
## 4 1658.693 3004.4565 1306.708    0.000 2643.513 3671.725 2442.9956
## 5 1799.332 5642.2528 3828.044 2643.513    0.000 6185.365 5019.0051
## 6 4608.262 1779.8748 3090.272 3671.725 6185.365    0.000 2486.2690
## 7 4019.307  850.3531 1193.267 2442.996 5019.005 2486.269    0.0000

Interactive Maps

So far, we've been making static maps. But those are boring. Let's make an interactive map using Leaflet. There's a bunch of different basemaps to choose from; from the list here, enter the name of the basemap in quotes in the addProviderTiles function.

library(leaflet)
library(dplyr)

imap_1 <- leaflet() %>%
  addProviderTiles("OpenStreetMap") %>% 
  addCircles(data=hh_data_sdf)

NOTE: leaflet assumes you are using the following coordinate reference system:

"+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0" 

imap_1

We can save our interactive map outside of R. Here, we use saveWidget from the htmlwidgets package. Doubleclicking the html file will open the file in any browser.

library(htmlwidgets)
saveWidget(imap_1, file=file.path(Output,"interactive_map.html"))

Lets make a better interactive map

# Define Palette
pal <- colorFactor(
  palette = c("Green","Yellow","Red"),
  domain = hh_data_sdf$food_security)

imap_2 <- leaflet() %>%
  addProviderTiles("Stamen.Terrain") %>% 
  addCircleMarkers(data=hh_data_sdf,
                 radius=3,
                 fillOpacity=1,
                 color=~pal(food_security),
                 stroke=F, # remove outer-circle around circle
                 popup=~food_security) %>% # variable to display when click feature
  # Add legend for points layer
  addLegend(pal = pal, 
          values = hh_data_sdf$food_security,
          title = "Food Security")

imap_2

Polygon Data

# Load and Prep Plot-Level Data ------------------------------------------------
setwd(file.path(finalData,"Shapefiles")) # set filepath to folder with shapefile
ag_fields <- readOGR(dsn=".", layer="allsitessubset", verbose = FALSE)

# Plot File Projection
crs(ag_fields)
## CRS arguments:
##  +proj=tmerc +lat_0=0 +lon_0=30 +k=0.9999 +x_0=500000 +y_0=5000000
## +ellps=GRS80 +units=m +no_defs
# HH Survey Location Projection
crs(hh_data_sdf)
## CRS arguments:
##  +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
# Reproject plot data to have same projection as HH survey data
hh_dd_proj <- as.character(crs(hh_data_sdf))
ag_fields <- spTransform(ag_fields, CRS(hh_dd_proj))

plot(ag_fields)

ag_fields
## class       : SpatialPolygonsDataFrame 
## features    : 19 
## extent      : 30.25664, 30.67621, -2.095173, -1.930931  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 1
## names       :        site 
## min values  :   Kayonza15 
## max values  : Rwamagana35
ag_fields@polygons Each polyon is defined by a matrix of vertices

Simplify the Shapefile

head(ag_fields)
##          site
## 0 Rwamagana33
## 1 Rwamagana33
## 2   Kayonza15
## 3   Kayonza15
## 4 Rwamagana34
## 5 Rwamagana34

You'll notice that multiple features have the same name. Here, let's simplify the polygon so that each site name represents one feature.

To aggregate (or dissolve) the polygon by the variable site, use the aggregate function from the raster package.

ag_fields <- aggregate(ag_fields, by = "site")
plot(ag_fields)

In the previous slide, you'll notice there's a weird tic mark in one of the polygons. To get rid of this, let's buffer the polygon by a very small amount – which will help cover the hole in the field.

ag_fields <- gBuffer(ag_fields, width=0.000001/111, byid=T)
plot(ag_fields)

Prep Data for Map

Before we make a map, let's merge in plot some data for each agriculture field. I've already prepared this data, but it includes average food securtiy and average food expenditure per agriculture field based on the household data.

ag_fields_data <- read.csv(file.path(finalData, "plot_data.csv"))
ag_fields <- merge(ag_fields, ag_fields_data, by="site")
summary(ag_fields@data)
##           site   expend_food_yearly_mean food_security_mean
##  Kayonza15  :1   Min.   :189101          Min.   :1.468     
##  Kayonza4   :1   1st Qu.:262529          1st Qu.:1.599     
##  Rwamagana2 :1   Median :321472          Median :1.736     
##  Rwamagana33:1   Mean   :304092          Mean   :1.726     
##  Rwamagana34:1   3rd Qu.:354403          3rd Qu.:1.898     
##  Rwamagana35:1   Max.   :384243          Max.   :1.915

ggplot can't interpret spatial dataframes. Consequently, we need to transform the data into a format that ggplot can understand. Here, we make a dataframe where each vertice of the polygon is an observation.

library(broom)

ag_fields_df <- tidy(ag_fields, region="site")
head(ag_fields_df)
##       long       lat order  hole piece       group        id
## 1 30.58359 -1.936452     1 FALSE     1 Kayonza15.1 Kayonza15
## 2 30.58364 -1.936513     2 FALSE     1 Kayonza15.1 Kayonza15
## 3 30.58364 -1.936496     3 FALSE     1 Kayonza15.1 Kayonza15
## 4 30.58366 -1.936071     4 FALSE     1 Kayonza15.1 Kayonza15
## 5 30.58366 -1.936071     5 FALSE     1 Kayonza15.1 Kayonza15
## 6 30.58366 -1.936032     6 FALSE     1 Kayonza15.1 Kayonza15

The resulting dataframe from tidy:

  • Has the variable id, which is taken from the the variable in the region argument.
  • Doesn't have any of our other variables (e.g., food security).
  • We can merge our other variables (in ag_fields) into this dataframe using the site variable from ag_fields and the id variable from ag_fields_df.

ag_fields_df <- merge(ag_fields_df, ag_fields, by.x="id", by.y="site")
head(ag_fields_df)
##          id     long       lat order  hole piece       group
## 1 Kayonza15 30.58359 -1.936452     1 FALSE     1 Kayonza15.1
## 2 Kayonza15 30.58364 -1.936513     2 FALSE     1 Kayonza15.1
## 3 Kayonza15 30.58364 -1.936496     3 FALSE     1 Kayonza15.1
## 4 Kayonza15 30.58366 -1.936071     4 FALSE     1 Kayonza15.1
## 5 Kayonza15 30.58366 -1.936071     5 FALSE     1 Kayonza15.1
## 6 Kayonza15 30.58366 -1.936032     6 FALSE     1 Kayonza15.1
##   expend_food_yearly_mean food_security_mean
## 1                189101.4           1.914634
## 2                189101.4           1.914634
## 3                189101.4           1.914634
## 4                189101.4           1.914634
## 5                189101.4           1.914634
## 6                189101.4           1.914634

Now, let's make a map

ggplot() + 
  geom_polygon(data = ag_fields_df, aes(x = long, y=lat, group=group, 
                                    fill=expend_food_yearly_mean))

Let's make a better map

ag_map1 <- ggplot() + 
  geom_polygon(data = ag_fields_df, aes(x = long, y=lat, group=group, 
                                    fill=expend_food_yearly_mean),
               color="black", size=0.25) + # Borders of polygons
  
  coord_quickmap() + # Make sure map doesn't look distorted
  theme_void() + # Set theme of plot 
  scale_fill_gradient(low = "orangered1", high = "green1") + # Color Gradient
  labs(fill="Food\nExpenditure", title="Annual Food Expenditure") + # Labels
  theme(plot.title = element_text(hjust = 0.5, face="bold")) # Center title

ag_map1

Lets use a different color palette

library(RColorBrewer)

ag_map2 <- ggplot() + 
  geom_polygon(data = ag_fields_df, aes(x = long, y=lat, group=group, 
                                    fill=expend_food_yearly_mean),
               color="black", size=0.25) + # Borders of polygons
  
  coord_quickmap() + 
  theme_void() + 
  scale_fill_distiller(palette = "Spectral", direction = 1) + # Color Gradient
  labs(fill="Food\nExpenditure", title="Annual Food Expenditure") + 
  theme(plot.title = element_text(hjust = 0.5, face="bold")) 

ag_map2

ggplot colors scale summary

Defing own color scale

  • scale_*_gradient(low, high): [For continuous variable] Manually define low/high colors
  • scale_*_gradientn(colors): [For continuous variable] Use a defined list of colors (e.g., c("purple","blue","yellow","white"))
  • scale_*_manual(colors): [For discrete variable] Use a defined list of colors, where the list is equal to the number of unique observations in the discrete variable.

Using pre-defined palettes

  • scale_*_distiller(palette): [For contiuous variables]
  • scale_*_brewer(palette): [For discrete variables]

Where * is either color or fill. Discrete variables = factor variables.

Add text to map

Now, let's add text to the map showing regions. To do this, we need to create a dataframe that includes the lat/lon of the center of each plot and a variable for the site name. To do this, we use:

  • gCentroid() which returns a shapefile of the center of each region
  • coordinates which returns a matrix of the coordinates of a shapefile, where the first column is longitude and the second column is latitude.
# Create dataframe of center of each plot with site name as variable
ag_fields_center <- gCentroid(ag_fields, byid=T) 
ag_fields_center <- as.data.frame(coordinates(ag_fields_center))

ag_fields_center$site <- ag_fields$site
names(ag_fields_center) <- c("longitude","latitude","site")

Now, lets use geom_text to add text to our map.

ag_map2 + geom_text(data=ag_fields_center, aes(label=site, x=longitude, y=latitude),
            check_overlap = TRUE) # makes sure text doesn't overlap

If you have lots of cluttered text, use geom_text_repel from ggrepel.

library(ggrepel)

ag_map2 + geom_text_repel(data=ag_fields_center, aes(label=site, x=longitude, y=latitude))

Let's add HH locations to the map

ag_map2 <- ggplot() + 
  geom_polygon(data = ag_fields_df, aes(x = long, y=lat, group=group, 
                                    fill=expend_food_yearly_mean),
               color="black", size=0.25) + # Borders of polygons
  geom_point(data=hh_data, 
           aes(x=longitude_scramble, 
               y=latitude_scramble, 
               color="HH Location"), # Name an aesthetic want you want to 
                                     # appear on legend
           size=.1, alpha=.6) +
  scale_color_manual(values="black") + # Manually define color of points
  coord_quickmap() + 
  theme_void() +
  scale_fill_gradient(low = "orangered1", high = "green1") + 
  labs(fill="Food\nExpenditure", title="Annual Food Expenditure", 
       color="") + # Setting color="" makes the title above the legend item blank
  theme(plot.title = element_text(hjust = 0.5, face="bold")) 

ag_map2

Add Polygon to Interactive Map

leaflet() %>%
  addProviderTiles("OpenStreetMap") %>% 
  addPolygons(data=ag_fields)

pal_foodexpend <- colorNumeric("RdYlGn", ag_fields$expend_food_yearly_mean)

imap_3 <- leaflet() %>%
  addProviderTiles("OpenStreetMap") %>%
  addPolygons(data=ag_fields,
            fillColor = ~pal_foodexpend(expend_food_yearly_mean),
            fillOpacity = 0.6,
            color="black", # color of line around polygons
            weight=1, # width of line around polygons
            popup=~site)

imap_3

Map Using Admin Data

Now, let's make a map using administrative-level data. The get_data function from the raster package allows us to download data from GADM (the Database of Global Administrative Areas). Let's download the third administrative division for Rwanda.

rwa_adm <- getData('GADM', country='RWA', level=3)
rwa_adm
## class       : SpatialPolygonsDataFrame 
## features    : 422 
## extent      : 28.86171, 30.89907, -2.839973, -1.04745  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 16
## names       : OBJECTID, ID_0, ISO, NAME_0, ID_1,           NAME_1, ID_2,    NAME_2, ID_3, NAME_3, CCN_3, CCA_3,     TYPE_3,  ENGTYPE_3, NL_NAME_3, ... 
## min values  :        1,  189, RWA, Rwanda,    1,     Amajyaruguru,    1,  Bugesera,    1,   Base,     0,     0,     Sector,     Sector,          , ... 
## max values  :      422,  189, RWA, Rwanda,    5, Umujyi wa Kigali,   30, Rwamagana,  422,   Zaza,  5715,  5715, Water body, Water body,          , ...

head(rwa_adm)
##   OBJECTID ID_0 ISO NAME_0 ID_1       NAME_1 ID_2 NAME_2 ID_3  NAME_3
## 1        1  189 RWA Rwanda    1 Amajyaruguru    1 Burera    1  Bungwe
## 2        2  189 RWA Rwanda    1 Amajyaruguru    1 Burera    2  Butaro
## 3        3  189 RWA Rwanda    1 Amajyaruguru    1 Burera    3 Cyanika
## 4        4  189 RWA Rwanda    1 Amajyaruguru    1 Burera    4   Cyeru
## 5        5  189 RWA Rwanda    1 Amajyaruguru    1 Burera    5 Gahunga
## 6        6  189 RWA Rwanda    1 Amajyaruguru    1 Burera    6  Gatebe
##   CCN_3 CCA_3 TYPE_3 ENGTYPE_3 NL_NAME_3 VARNAME_3
## 1  4401  4401 Sector    Sector                    
## 2  4402  4402 Sector    Sector                    
## 3  4403  4403 Sector    Sector                    
## 4  4404  4404 Sector    Sector                    
## 5  4405  4405 Sector    Sector                    
## 6  4406  4406 Sector    Sector

I've preppared a dataset of average vegetation levels per administrative unit in Rwanda. Here, 0 means no vegetation and 1 means significant vegetation.

rwa_veg <- read.csv(file.path(finalData, "rwanda_vegetation.csv"))
head(rwa_veg)
##   X OBJECTID vegetation
## 1 1        1      0.542
## 2 2        2      0.602
## 3 3        3      0.677
## 4 4        4      0.604
## 5 5        5      0.591
## 6 6        6      0.546

Exercise

Make an administrative-level map of vegetation levels. Here's a broad outline of steps you'll need to implement:

  • Merge the vegetation data with the Rwanda spatial dataframe.
  • Convert the spatial dataframe into a dataframe that is interpretable by ggplot
  • Make the map using ggplot.
  • Note: There are some NA values; think about effective ways to map those.
  • Hint: If your computer is talking a long time to plot, run the below code which simplifies the polygon (i.e., removes some of the vertices to simplify the shape). Larger values of tol make the polygon more simplified. I played around with values and found 0.003 to be adequate. Note that large values of tol might reduce the number of polygons!
rwa_adm_simple <- gSimplify(rwa_adm, tol=0.003) # simplifies the polygon

# Resulting layer from gSimplify has no variables, so need to add a variable back in
rwa_adm_simple$OBJECTID <- rwa_adm$OBJECTID 

Solution

# 1. Merge data together -------------------------------------------------------
rwa_adm_simple <- merge(rwa_adm_simple, rwa_veg, by="OBJECTID")

# 2. Make dataframe interpretable by ggplot ------------------------------------
rwa_adm_df <- tidy(rwa_adm_simple, region="OBJECTID")
rwa_adm_df <- merge(rwa_adm_df, rwa_adm_simple, by.x="id", by.y="OBJECTID")

# 3. Make Plot -----------------------------------------------------------------
veg_map <- ggplot() +
  geom_polygon(data=rwa_adm_df, aes(x=long, y=lat, group=group), 
               fill="white", color="black",size=.1) +
  geom_polygon(data=rwa_adm_df[!is.na(rwa_adm_df$vegetation),], 
             aes(x=long, y=lat, group=group, fill=vegetation),
             color="black",size=.1) +
  theme_void() +
  scale_fill_distiller(palette="RdYlGn", direction=1) +
  labs(fill="Vegetation", title="Vegetation in Rwanda") +
  theme(plot.title = element_text(hjust = 0.5, face="bold"))

veg_map

APPENDIX

Raster Files

Satellite imagery often comes in multiple bands. Each band captures a different part of the electromagnetic spectrum. Different combinations of bands are used to create spectral indices which are used to highlight different land cover elements, such as vegetation, built-up area, or burnt areas.

\[NDVI = \frac{NIR-Red}{NIR+Red}\]

Some common satellites include

  • Landsat: Captures images across the earth every 16 days at a 30 meter resolution across multiple spectral bands. Landsat began collecting images in 1972 and continues to the present.

  • Sentinel: Captures images across the earth every 5 days (at the equator) at a 10 meter resolution across multiple sepctral bands. Began in 2014 and continues to the present.

  • MODIS: Captures daily to bi-weekly imagery ranging from a 250m to 500m resolution.

  • VIIRS: Monthly nighttime lights at a 750m resolution from April 2012 to present.

Load and Plot NDVI Around Rwanda

ndvi_2012_a <- raster(file.path(finalData, "Tiffs", "ndvi_2012_0203.tif"))
plot(ndvi_2012_a) 
plot(ag_fields,add=T)

Crop Image to Agricultural Fields

ndvi_2012_a <- crop(ndvi_2012_a,  ag_fields)
plot(ndvi_2012_a)
plot(ag_fields, add=T)

Mask Image to Agricultural Fields

ndvi_2012_a <- mask(ndvi_2012_a,  ag_fields)
plot(ndvi_2012_a)

Plot Raster Using ggplot

You can also plot rasters using ggplot. Here, we need to convert the raster into a dataframe with latitude, longitude and the values. Then, we use geom_tile.

ndvi_2012_a_df <- as(ndvi_2012_a, "SpatialPixelsDataFrame")
ndvi_2012_a_df <- as.data.frame(ndvi_2012_a_df)

ggplot() +
  geom_tile(data=ndvi_2012_a_df, aes(x=x, y=y, fill=ndvi_2012_0203),alpha=1) +
  scale_fill_gradient(low="red",high="green") +
  coord_quickmap() + 
  theme_void() +
  labs(fill="NDVI", title="NDVI")

Plot Raster Using rasterVis

We can also plot the raster using other packages that don't require first converting the object to a dataframe. Here, we use rastervis.

library(rasterVis)

cols <- colorRampPalette(brewer.pal(9,"RdYlGn"))

levelplot(ndvi_2012_a,
          col.regions = cols)

Make better map

levelplot(ndvi_2012_a,
          col.regions = cols,
          margin=F, # No histograms on side
          par.settings=
             list(axis.line=list(col='transparent')), # No box around image
          scales=list(draw=FALSE), # No latitude and longitude markers
          xlab="",
          ylab="",
          main="NDVI 2012 (Season A)",
          colorkey=list(space="right"), # legend location. can change to bottom,
                                        # left and top. If you don't want a 
                                        # legend, use: colorkey = F
          maxpixels=1e6 # sometimes blends pixels together
          ) +
  layer(sp.polygons(ag_fields, 
                    col="black", # border color
                    lwd=1, # line width
                    alpha=1))

Raster Arithmetic

Note that NDVI goes from -1 to 1; MODIS scales this value by 10000 so we'll scale this back to -1 to 1 by multiplying values by 0.0001. There are two ways to do arithmetic on rasters:

raster_new <- raster * 0.0001

OR

raster_new <- calc(raster, fun = function(x) x * 0.0001)

The second way looks ugly, but it is more efficient.

Plot Multiple Images

library(rasterVis)
library(RColorBrewer)

ndvi_2012_a <- raster(file.path(finalData, "Tiffs", "ndvi_2012_0203.tif")) %>%
  crop(ag_fields) %>%
  mask(ag_fields) %>%
  calc(fun=function(x) x*0.0001)
ndvi_2012_b <- raster(file.path(finalData, "Tiffs", "ndvi_2012_0910.tif")) %>%
  crop(ag_fields) %>%
  mask(ag_fields) %>%
  calc(fun=function(x) x*0.0001)
ndvi_2016_a <- raster(file.path(finalData, "Tiffs", "ndvi_2016_0203.tif")) %>%
  crop(ag_fields) %>%
  mask(ag_fields) %>%
  calc(fun=function(x) x*0.0001)
ndvi_2016_b <- raster(file.path(finalData, "Tiffs", "ndvi_2016_0910.tif")) %>%
  crop(ag_fields) %>%
  mask(ag_fields) %>%
  calc(fun=function(x) x*0.0001)

ndvi_stack <- stack(ndvi_2012_a, ndvi_2012_b, 
                    ndvi_2016_a, ndvi_2016_b)

cols <- colorRampPalette(brewer.pal(9,"RdYlGn"))

levelplot(ndvi_stack,
          main="NDVI Across Seasons",
          col.regions=cols,
          layout=c(2, 2))

Interactive Map with Rasters

pal <- colorNumeric(c("red3", "yellow", "palegreen4"), values(ndvi_stack),
                    na.color = "transparent")

NDVI_interactive <- leaflet() %>% 
  addProviderTiles("OpenStreetMap") %>%
  addRasterImage(ndvi_2012_a, colors = pal, opacity = 1, group="2012 A") %>%
  addRasterImage(ndvi_2012_b, colors = pal, opacity = 1, group="2012 B") %>%
  addRasterImage(ndvi_2016_a, colors = pal, opacity = 1, group="2016 A") %>%
  addRasterImage(ndvi_2016_b, colors = pal, opacity = 1, group="2016 B") %>%
  addLegend(pal = pal, values = values(ndvi_stack),
    title = "NDVI") %>%
  addLayersControl(
    overlayGroups = c("2012 A","2012 B","2016 A","2016 B"),
    options = layersControlOptions(collapsed = FALSE))

NDVI_interactive

Summarize NDVI By Plot

Let's say we want to determine average NDVI values for each plot or the NDVI value under each household location. To do this, we use extract from the raster package.

# Average NDVI Value per Plot
ag_fields$NDVI_2012_a <- extract(ndvi_2012_a, ag_fields, fun=mean)

# NDVI Value per HH Location
hh_data_sdf$NDVI_2012_a <- extract(ndvi_2012_a, hh_data_sdf)

Sometimes this process can be slow. To (dramatically) increase the speed of this task, use the velox package. This is built to work with spatial polygons (not points).

  • velox(raster) converts the raster object to a velox raster object. From the raster object you can use the extract method, where the inputs are the shapefile and summarize function.
library(velox)
ag_fields$ndvi_2012_a_velox <- velox(ndvi_2012_a)$extract(ag_fields, fun=mean)

subset(ag_fields@data, select=c(site, NDVI_2012_a, ndvi_2012_a_velox))
##          site NDVI_2012_a ndvi_2012_a_velox
## 1   Kayonza15   0.4951754         0.4951754
## 2    Kayonza4   0.5363880         0.5363880
## 3  Rwamagana2   0.5958732         0.5958732
## 4 Rwamagana33   0.5472275         0.5472275
## 5 Rwamagana34   0.5667049         0.5667049
## 6 Rwamagana35   0.5793644         0.5793644

Useful Projections

Geographic Coordinate System

The most common geographic coordinate system is the World Geodetic System (WGS84).

"+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"

Projected Coordinate Systems

Below are projections for equal area and equal distance. Replace [LATITUDE VALUE] and [LONGITUDE VALUE] with the center of the area that you're working with.

Equal Area: Lambert Azimuthal Equal Area

"+proj=laea +lat_0=[LATITUDE VALUE] +lon_0=[LONGITUDE VALUE]"

Equal Distance: Azimuthal equidistant projection

"+proj=aeqd +lat_0=[LATITUDE VALUE] +lon_0=[LONGITUDE VALUE]"

Calculating Distances: Way 1

A common task in GIS is to calculate distances. Here, let's calculate the distance of each household location to the nearest city (using a dataset of Rwanda's 3 largest cities). This same syntax can be used to calculate the shortest distance to polygons (e.g., a lake) or polylines (e.g., roads).

Import city-level dataset

rwa_cities <- read.csv(file.path(finalData, "rwa_cities.csv"))
coordinates(rwa_cities) <- ~lon+lat
crs(rwa_cities) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")

Re-project to equal distant projection

# Here, I used Kigali as the center
eq_dist_projection <- "+proj=aeqd +lat_0=-1.943889 +lon_0=30.059444"

rwa_cities_ed <- spTransform(rwa_cities, CRS(eq_dist_projection))
hh_data_sdf_ed <- spTransform(hh_data_sdf, CRS(eq_dist_projection))

Creates distance matrix

gDistance(rwa_cities_ed, hh_data_sdf_ed, byid=T) %>% head
##          1         2        3
## 1 39419.69 101785.29 74845.93
## 2 41335.03 100056.47 75709.40
## 3 41433.12 101555.78 76291.83
## 4 40642.18 101794.90 75775.02
## 5 40443.28 103553.65 76166.05
## 6 39694.79  98492.55 73943.48

Calculate shortest distance to any city.

gDistance_cities_i <- function(i) gDistance(hh_data_sdf_ed[i,], rwa_cities_ed, byid=F)
hh_data_sdf_ed$dist_city <- lapply(1:nrow(hh_data_sdf_ed), gDistance_cities_i) %>% unlist

Calculating Distances: Way 2

There are also ways to calculate distances by using a geographic coordinate system (ie, keeping the world a sphere) and taking into account the curvator of the earth. Haversine (good for short distances – think the distance of a short haul flight) and vincenty (slower but better for longer distances) are common formulas to calculate these distances.

The below code uses the haversine method to calculate the distance between all the household locations and the first Rwandan city. The output is in meters.

library(geosphere)
distHaversine(hh_data_sdf, rwa_cities[1,]) 

We can also calculate the distance between a point and a line using the dist2Line function from the geosphere package. Below shows an example between an arbitrary point and line shapefile using the haversine formula.

dist2Line(points, line, distfun=distHaversine)